Abstract

A system, methods, and algorithms for content-based image retrieval and recognition,
useful for all types of images and image formats. An image(s) or an image segment(s), which is
specified by the user in two clicks (the first in the upper-left corner and the second in the
bottom-right corner), specifies the content-based sample. The sample image(s) is used to teach the
system what to look for via the ABM (Attrasoft Boltzmann Machine) algorithm and APN
(Attrasoft PolyNet) algorithm; the system then searches through one or many directories, which
are specified by the user, and presents the search results. The search result consists of pairs
of a matched image and a weight (score), which specifies the similarity between the sample and
the matching images. These weights are also used to classify images in the case of a
classification problem. The users are able to view the retrieved images in the result via a single
click. When the algorithm is implemented as a software component, the system integration will
follow the specification of the "Attrasoft Image Verification and Identification Application
Programming Interface (IVI-API)".
TECHNICAL FIELD

The present invention relates generally to image retrieval and image recognition, and more
particularly to a system, methods, and algorithms for content-based image retrieval and
recognition. Within such a system, the image(s) to be retrieved/recognized is not
preprocessed with the association of key words (meta-data). This system allows the user of an
image retrieval/recognition system, such as software together with a computer, network server,
or web server, to define search criteria by using an image(s), a segment of an image(s), a
directory containing images, or combinations of the above. This system will return the result,
which contains pairs of the matched image and similarity. The user can see the matched images
in a single click.

This invention can be used in image verification (1:1 matching, binary output: yes/no), image
identification (1:N matching, single output to indicate a classification), image search or retrieval
(1:N matching, multiple output), and image classification (N:1 or N:N matching). For simplicity,
we will use only the word "retrieval".
BACKGROUND OF THE INVENTION



In certain types of content-based image retrieval/recognition systems, the central task of the
management system is to retrieve images that meet some specified constraints.

Most image-retrieval methods are limited to the keyword-based approach. In this approach, keywords and 
the images together form a record in a table. The retrieval is based on the keywords in much the same 
way as the relational database. (Example: Microsoft Access). 

The user operation is generally divided into two phases: the learning phase and the search/recognition 
phase. In the learning phase, various types of processes, such as image preprocessing and image filtering 
are applied to the images. Then the images are sent to a recognition module to teach the module the 
characteristics of the image. The learning module can use various algorithms to learn the sample image. 
In the search/retrieval phase, the recognition module decides the classification of an image in a search 
directory or a search database. 

A very small number of commercially available products exist which perform content-based 
image retrieval. 

Informix Internet Foundation.2000 is an object-relational database management system
(ORDBMS), which supports non-alphanumeric data types (objects). IIF2000 supports several
DataBlade modules including the Excalibur Image DataBlade module to extend its retrieval
capabilities. DataBlade modules are server extensions that are integrated into the core of the
database engine. The Excalibur Image DataBlade is based on technology from Excalibur
Technologies Corporation, and is co-developed and co-supported by Informix and Excalibur.

The core of the DataBlade is the Excalibur Visual RetrievalWare SDK. The Image DataBlade
module provides image storage, retrieval, and feature management for digital image data. This
includes image manipulation, I/O routines, and feature extraction to store and retrieve images by
their visual contents. An Informix database can be queried by aspect ratio, brightness, global
colour, local colour, shape, and texture attributes. An evaluation copy of IIF2000 and the
Excalibur Image DataBlade module can be downloaded from www.informix.com/evaluate/.

IMatch is a content-based image retrieval system developed for the Windows operating system.
The software was developed by Mario M. Westphal and is available under a shareware license.
IMatch can query an image database by the following matching features: colour similarity,
colour and shape (Quick), colour and shape (Fuzzy), colour percentage, and colour distribution.
A fully functional 30-day evaluation copy is available for users to assess the software's
capabilities and can be downloaded from www.mwlabs.de/download.htm. The shareware version
has a limit of 2000 on the number of images that can be added to a database. A new version of the
software was released on 18 February 2001.

The Oracle8i Enterprise Server is an object-relational database management system that includes
integral support for BLOBs. This provides the basis for adding complex objects, such as digital
images, to Oracle databases. The Enterprise release of the Oracle database server includes the
Visual Information Retrieval (VIR) data cartridge developed by Virage Inc. OVIR is an extension
to the Oracle8i Enterprise Server that provides image storage, content-based retrieval, and format
conversion capabilities through an object type. An Oracle database can be queried by global
color, local color, shape, and texture attributes. An evaluation copy of the Oracle8i Enterprise
Server can be downloaded from otn.oracle.com.



SUMMARY OF THE INVENTION 

The present invention is different from the Informix database, where images can be queried by aspect
ratio, brightness, global colour, local colour, shape, and texture attributes. The present invention
is different from IMatch, where images can be queried by colour similarity, colour and shape
(Quick), colour and shape (Fuzzy), colour percentage, and colour distribution. The present
invention is different from the Oracle8i Enterprise Server, where images can be queried by global
color, local color, shape, and texture attributes.

The present invention is unique in its sample image, control process, control parameters, and
algorithms. The current algorithms do not use methodologies deployed in the above systems. In
particular, the following parameters are not used: aspect ratio, brightness, global colour, local
colour, shape, colour similarity, colour and shape (Quick), colour and shape (Fuzzy), colour
percentage, colour distribution, and texture attributes. The present
invention has nothing in common with any existing system.

Even though the present invention is applied to images, the algorithms in the invention can be
applied to other types of data, such as sound, movie, etc.

1. Process

The present invention is a content-based image retrieval/recognition system, where users specify
an image(s) or segment(s), adjust control parameters of the system, and query for all matching
images from an image directory or database. The user operation is generally divided into two
phases: the learning phase and the search/recognition phase. In the learning phase, various types of
processes, such as image preprocessing, image size reduction, and image filtering, are applied to
the images. Then the images are sent to a recognition module to teach the module the
characteristics of the image as specified by an array of pixels. Each pixel is defined by an
integer, which can have any number of bits. The learning module can use the ABM or APN learning
algorithms to learn the sample image. Both algorithms are listed in the present invention.
In the search/retrieval phase, the recognition module decides the classification of an image in a
search directory or a search database.

In a retrieval/recognition system, "training" the system, or "learning" by the system, is to
teach the system what characteristics of an image, or a segment of an image (key), to look for. A
system operator completes this step by specifying the sample image(s), specifying the parameters,
and clicking one button, the "training" button, which appears in the graphical user interface of
the system. "Retraining" is to teach the system what characteristics of images to
look for after the system is already trained. Training and retraining together allow the system to
learn from many sample image(s) and segment(s) simultaneously.

A "search" or "retrieval" is to look for matching images from an image source such as a directory,
many directories, subdirectories, a network, the Internet, or a database. A system operator
completes this step by specifying the image source, such as search directory(s), specifying the
parameters, and clicking one button, the "searching" button, which appears in the graphical user
interface of the system. The results can be displayed within the software system or displayed in
a program created by the system. Two particular applications are image verification (1:1
matching, binary output: yes/no) and image identification (1:N matching, single output to
indicate a classification).

A "classification" or "recognition" is to repeat training and search for each category of images. 
At the end, a system operator clicks one button, the "classification" button, which appears in the 
graphical user interface of the system. The results can be displayed within the software systems 
or displayed in a program created by the system. Classification is an N:N matching with a single
output to indicate a classification. 

The parameters and settings of a particular operation can be saved and recalled later. A saved
operation can be recalled by clicking a button, cutting and pasting, opening a file, or typing.
The saved results are called "batch code". The "Batch" buttons provide a means to execute these
saved batch codes.

A "process" is a sequence of training and searching, or a classification, or a specification of a 
batch code and execution of a batch code. They are further divided into a search process, a 
classification process, and a batch process. 

After the operator completes a process, the result consists of a list of pairs; each pair consists of
the matched image and the "weight", which reflects how closely the selected image matches the
sample image(s). This list can be sorted or unsorted. This list provides links to the matched
images so the matched images can be viewed with a single click.

"System integration" is to combine a software component, which is an implementation of this 
invention, with an application interface. 

The search process, which is applicable to retrieval, verification, and identification, is: 

1. Enter key image into the system;

2. Set training parameters and click the training button to teach the system what to look for; 

3. Enter search-directory(s); 

4. Set search parameter(s), and click the search button; 

5. The system output is a list of names and weights: 

• The weight of an image is related to the characteristics you are looking for (the weight is 
similar to an Internet search engine weight); 

• Click the name of each image and an image will pop up on the screen. 



The classification process is: 



1. Enter key image into the system;

2. Set training parameters and click the training button to teach the system what to look for; 

3. Enter search-directory(s); 

4. Set search parameter(s), and click the search button;

5. Repeat the above process for each class and then click the "Record" button. At the end, click 
the "Classification" button. The output web page will first list the sample images for each 
class. Then it will list: 

• An image link for each image in the search directory; 

• The classification weights of this image in each search; and 

• The classification of this image as a link. 

The batch process is: 

1. Provide the batch code to the system, which includes:

• Click the save button to save the current setting, including key(s), search directory(s), and
parameters into a batch code.

• Click a file button to recall one of the many batch codes saved earlier.

• Cut and paste or simply type in a batch code by keyboard.

2. Click the batch button to execute the code.

An integration process is to combine a software component, which is an implementation of this
invention, with an application interface. This invention also specifies a graphical user interface
for the integration.

2. Parameters

The search, classification, and batch processes require a set of parameters. All the parameters can
be specified in the system user interface, either through clicking buttons or through Windows.
The parameters are specifically related to the ABM and APN algorithms, which will be claimed in
this patent.

The "Area of Interest" specifies an image segment, which is specified by 4 numbers: the coordinates of 
the upper-left corner and the bottom-right corner. 
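
The following is a minimal Python sketch of how two clicks could be turned into the 4 numbers of the "Area of Interest" and used to cut out the segment. The function names, the (x, y) click format, the row-major pixel array, and the inclusive-bounds convention are assumptions made for illustration; they are not specified by the present invention.

```python
def area_of_interest(first_click, second_click):
    """Turn two (x, y) clicks into the 4 numbers (left, top, right, bottom)."""
    (x1, y1), (x2, y2) = first_click, second_click
    # Assumption: sort the coordinates so the box is valid even if the
    # clicks arrive in the opposite order.
    return (min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2))


def crop(pixels, box):
    """Return the segment of a row-major pixel array inside box (bounds inclusive)."""
    left, top, right, bottom = box
    return [row[left:right + 1] for row in pixels[top:bottom + 1]]


# Toy 4x5 image whose pixel value encodes its position (10 * row + column).
image = [[10 * r + c for c in range(5)] for r in range(4)]
box = area_of_interest((1, 1), (3, 2))
segment = crop(image, box)
```

Here the first click at (1, 1) and the second at (3, 2) select a segment 3 pixels wide and 2 pixels high.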

The "internal representation" specifies the dimensions of a pixel array used for computation, 
which may or may not be the actual image pixel array. 

The "Background" or "Background filter" selects an image-processing filter the pixel array must 
pass through before entering the learning component of the system. 

The "Symmetry" represents similarity under certain types of changes, such as intensity,
translation, scaling, rotation, oblique, combined rotation and scaling, or any
combination thereof.

The "Rotation Types" specify the range of rotation if the rotation symmetry is used. Examples
are 360°-rotations, -5° to 5° rotations, and -10° to 10° rotations, or other settings that fit the
user's need.

The "Reduction Type" specifies the method used when reducing a large image pixel array to a 
smaller pixel array. 

The "Sensitivity" deals with the sample segment size; high sensitivity is for small segment(s) and 
low sensitivity is for large segment(s). 

The "Blurring" measures the distortion due to data compression, translation, rotation, scaling, 
intensity change, and image format conversion. 

The "Shape Cut" is to eliminate many images that have different shapes from the sample 
segment. 

The "External Weight Cut" is to list only those retrieved images with weights greater than a
certain value. The Weight Cut is an integer greater than or equal to 0. There is no limit on how large
this integer can be. The "Internal Weight Cut" plays a role similar to the "External Weight Cut",
but as a percent value rather than an absolute weight value.

The "Image Type" tells the learning component whether to treat the pixel array as a
black-and-white image or a color image. It also instructs the learning component whether to use a
maximum value, integration, or both.

The "L/S Segment" (Large/Small segment) tells the system where to focus when searching
images.

The "Short/Long" search specifies the image source, such as whether to search one directory or
many directories.

The "Short Cut" is a scrollbar to select an integer between 0 and 99; each integer is mapped to a
set of predefined settings for the parameters.

The "Border Cut" controls the portions of images to be used in the image recognition. 

The "Segment Cut" controls the threshold used to reduce an image into an internal 
representation. 

3. System Layout 

The Attrasoft Component-Object structure consists of three layers:

• Application Layer

• Presentation Layer

• ABM Network Layer



The ABM Network Layer has two algorithms to be claimed in the present invention: 

• ABM (Attrasoft Boltzmann Machine); 

• Attrasoft PolyNet (APN): multi-valued ABM. 

This layer is responsible for learning and classification. 

The Presentation Layer is an interface between the ABM net layer and the user interface layer. 
There are two types of data used by the systems: user data or application data, and ABM neural 
data. ABM networks use ABM neural data. User data depends on the application. The 
presentation layer converts the image data into neural data used by the ABM layer component. 

The Application Layer is the front-end graphical user interface, which the users see directly. This 
layer collects all parameters required for necessary computation. 

4. Algorithms

The ABM layer deploys two algorithms, ABM and APN. The ABM and APN algorithms consist
of a combination of Markov chain theory and neural network theory. Both theories are
well known. The ABM and APN algorithms are newly invented algorithms, which have never
been published.

The following terms are well known: Markov chain, state of a Markov chain, invariant distribution.

The basic flow chart for the ABM and APN algorithms is:

1. Combine an image and its classification into a vector. 

2. All such vectors together form a mathematical configuration space. Each point in such a space is
called a state.

3. A Markov chain exists in such a space where the state of the configuration space is a state 
of the Markov chain. 

4. The Markov chain will settle on its invariant distribution. A distribution function is 
deployed to describe such a distribution. In particular, such distribution function 
classifies the images. 

5. The construction of such a Markov chain is by a particular type of neural network, called 
ABM network or APN network. This type of neural net satisfies 3 features: (1) fully 
connected; (2) the order of the neural net is the same as the number of neurons in the 
network, i.e. the number of connections is an exponential function of the number of 
neurons; and (3) the connections follow particular algorithms, known as ABM and APN
algorithms. 

Step 4 above is defined as follows:

Let x be an image, and let a, b be two classes; then the two possible vectors are (x, a) and (x, b).
Let a distribution function be z = F(y), where y is a vector. If z = z1 for y = (x, a) and z = z2
for y = (x, b), then the probability of x in class a is z1 and the probability of x in class b is z2.
The result will be {(x, a, z1), (x, b, z2)}. The users will see results like this directly in the
output of the system.

In the ABM or APN algorithms, content-based image retrieval and image recognition are 
basically the same problem; therefore, they can be converted from one to the other. To convert 
from an image search problem to an image recognition problem, one query is required for each class. To 
see whether an image, say B, is in class A, you first train ABM with all images in class A, then try to 
retrieve image B. If image B is not retrieved, then image B is not in class A. If image B is retrieved only 
for class A, then image B is in class A. If image B is retrieved for several classes, the class with the 
largest relative probability is the one to which image B belongs. Image search is an image classification 
problem with only 1 class. 
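
The conversion from retrieval to recognition described above can be sketched in Python as follows. This is only an illustration: the function name, the use of None to mean "not retrieved", and the normalization of the weights into relative probabilities are assumptions, and the weights themselves would come from the ABM or APN layer, which is not reproduced here.

```python
def classify(image_name, weights_by_class):
    """Pick a class from one retrieval query per class.

    weights_by_class maps each class name to the weight returned when the
    system was trained on that class, or None if the image was not retrieved.
    """
    retrieved = {c: w for c, w in weights_by_class.items() if w is not None}
    total = sum(retrieved.values())
    # Assumption: relative probability = weight / sum of retrieved weights.
    triplets = [(image_name, c, w / total) for c, w in retrieved.items()] if total else []
    best = max(triplets, key=lambda t: t[2])[1] if triplets else None
    return best, triplets


# Image B retrieved for classes A and C, not retrieved for class D.
best_class, triplets = classify("B.jpg", {"A": 300, "C": 100, "D": None})
```

With only one class in the dictionary, the same function reduces to the search case: the image is either retrieved (one triplet) or not (no triplets).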

ABM is a binary network. APN is a multi-valued network. 



5. Components and Application-Programming Interface 

Software components can be isolated to be attached to different front-end systems. This can be 
done with ABM neural layer alone, or both ABM layer and presentation layer. The ABM layer 
component is a core of the present invention. The value of such a sub-system is the same as the 
whole system. 

This invention also defines the application-programming interface (API), which specifies the
system integration. This API is called IVI-API.

BRIEF DESCRIPTION OF VIEWS OF THE DRAWING 

Figure 1 shows the algorithm of the Search Process, which is applicable for image verification, 
identification, and retrieval. 

Figure 2 shows the algorithm of the Classification Process. 
Figure 3 shows the algorithm of the Batch Process. 
Figure 4 shows a 3-layer internal Architecture.
Figure 5 shows the ABM Neural Layer Overview. 
Figure 6 shows the APN Neural Layer Overview. 
Figure 7 shows the Presentation Layer Overview. 



Figure 8 lists the ABM Training Algorithm. 

Figure 9 lists the APN Training Algorithm. 

Figure 10 lists the ABM Recognition Algorithm. 

Figure 11 lists the ABM Recognition Algorithm.

Figure 12 shows a sample User Interface of the Present Invention. 

Figure 13 shows a sample Key Input for the Present Invention. 

Figure 14 shows a sample Search Output of the Present Invention. The search output is a list of 
pairs. 

Figure 15 shows a sample Classification output of the Present Invention. The classification 
output is a list of triplets. 

DETAILED DESCRIPTION OF THE DISCLOSED EMBODIMENT

Preferred Embodiment of the Search System 

An image search/classification system constructed in accordance with the preferred embodiment
comprises a computer-based workstation including monitor, keyboard and mouse, a content-based
image retrieval software system, and a source of images.

The source of the images may be on the local drive, network or the Internet. The source is 
connected to the workstation. The source of images may be accessed directly via open files, or 
indirectly, such as going into a file to find the images or going into a database application to find 
the images, etc. 

The preferred workstation can be a PC or any other type of computer that connects to a data
source.

The preferred content-based image retrieval software system is any software that has the ABM or
APN algorithm as a component. It can be a Windows-based system, a system based on any other
operating system, or an Internet-based system.

Overview of the ABM Algorithm 

The following terms are well known: synaptic connection or connection. 

The basic flow chart for the ABM algorithm is:

1. Create an ABM net with no connections.

2. Combine an image and its classification into an input vector.

3. Impose the input vector on the learning module.

4. The ABM neural connections are calculated based on the input vector. Let N be the
number of neurons; the order of connections can be up to N and the number of
connections can be 2**N, where ** represents exponentiation.

5. The Markov chain is formed after the connections are established. This Markov chain
will settle on its invariant distribution. A distribution function is deployed to describe
such a distribution.

6. This distribution function, once obtained, can be used to classify images. This will
produce triplets of image, class, and weight. Image retrieval and classification are two
sides of the same coin.

7. These triplets of image, class, and weight can be viewed as the results of the
classification process. For the search process, a doublet of image and weight is
displayed. The second part of the triplet is omitted because the search problem has only
one class.
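
To make the 2**N count in step 4 concrete, the Python sketch below enumerates every subset of N neurons and treats each subset of size K as one possible connection of order K. Identifying a connection with a subset of neurons (and counting the empty subset) is an assumption made for illustration; the sketch does not compute any ABM connection values.

```python
from itertools import combinations
from math import comb


def possible_connections(n):
    """List every subset of n neurons; a subset of size k stands for an order-k connection."""
    return [c for k in range(n + 1) for c in combinations(range(n), k)]


connections = possible_connections(4)
# Group the count of possible connections by their order (0 through N).
count_by_order = {k: sum(1 for c in connections if len(c) == k) for k in range(5)}
```

Because the count doubles with every added neuron, an implementation would plausibly store only the non-zero connections rather than the full space.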



Overview of the APN Algorithm 

The basic flow chart for the APN algorithm is:

1. Create an APN neural net with no connections.

2. Combine an image and its classification into an input vector.

3. Impose the input vector on the learning module.

4. The APN neural connections are calculated based on the input vector. Let N be the
number of neurons; the order of connections can be up to N and the number of
connections can be 2**N, where ** represents exponentiation.

5. A mapping over each connection is established. Let K be the number of neurons in a
K-order connection, where K is less than or equal to N; then this will be a K-to-K mapping,
i.e. the domain of the mapping has K integers and the range of the mapping has K
integers.

6. The K-element mapping is changed to an N-element mapping by adding (N - K) pairs of 0
to 0 relations, one for each of the neurons not in the set K. By taking the domain of this
mapping away, the range of this mapping forms a vector, the APN connection vector.

7. The Markov chain is formed after the connections are established. This chain will settle
on its invariant distribution. A distribution function is deployed to describe such a
distribution.

8. This distribution function, once obtained, can be used to classify images. This will produce
triplets of image, class, and weight.

9. Comparing the input vector and the APN connection vector modifies this weight. This
will produce a new set of triplets of image, classification, and weight.

10. These triplets of image, class, and weight can be viewed as the results of the
classification process. For the search process, a doublet of image and weight is
displayed. The second part of the triplet is omitted because the search problem has only
one class.
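
Step 6 can be illustrated with a short Python sketch. It assumes, purely for illustration, that the K-element mapping is given as a dictionary from neuron index to a multi-valued integer; the function name and this representation are hypothetical and are not taken from the APN algorithm itself.

```python
def apn_connection_vector(n, mapping):
    """Extend a K-element mapping to N elements and return its range as a vector.

    mapping: {neuron_index: value} for the K neurons taking part in the connection.
    """
    if any(not 0 <= i < n for i in mapping):
        raise ValueError("neuron index out of range")
    # Neurons outside the connection receive the 0 -> 0 relation; dropping the
    # domain and keeping only the range values yields an N-element vector.
    return [mapping.get(i, 0) for i in range(n)]


# A 2-order connection over neurons 1 and 4 in a net of N = 5 neurons.
vector = apn_connection_vector(5, {1: 3, 4: 2})
```

The resulting vector has the same length as the input vector, which is what allows the comparison described in step 9.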



User Interface Layer of software for implementation of ABM and APN 
Algorithm 

There are three major operations: 

• Search or retrieval; 

• Classification; and 

• Batch. 

These are the principal modes of the system that runs on the workstation. The software executed
in these three modes can have various user interfaces, such as in a Windows environment or a
web environment, etc. The user interface collects the necessary information for the computation.

Other than the key and the source of images, the user interface may or may not pass the
following information to the next layer:

The "Area of Interest" specifies an image segment by two clicks. These two clicks generate 4 numbers:
the coordinates of the upper-left corner and the bottom-right corner.

The "internal representation" specifies the dimensions of a pixel array used for computation,
which may or may not be the actual image pixel array.

The "Background" or "Background filter" selects an image-processing filter the pixel array must
pass through before entering the learning component of the system. The interface will be
responsible for selecting one of many available filters.

The "Symmetry" represents similarity under certain types of changes, such as intensity,
translation, scaling, rotation, oblique, combined rotation and scaling, or any
combination thereof. The translation symmetry is implemented by physically translating
the sample image to all possible positions. Similar methods can be applied to the other
symmetries.
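
A minimal Python sketch of the translation symmetry follows: the sample segment is placed at every position where it fits inside the internal representation, and each placement would then be used as an additional training input. The function name, the zero background fill, and the row-major list layout are assumptions made for illustration.

```python
def translations(sample, height, width, fill=0):
    """Yield (dy, dx, canvas) for every position where sample fits in the canvas."""
    sample_h, sample_w = len(sample), len(sample[0])
    for dy in range(height - sample_h + 1):
        for dx in range(width - sample_w + 1):
            # Start from an empty canvas and paste the sample at offset (dy, dx).
            canvas = [[fill] * width for _ in range(height)]
            for r in range(sample_h):
                canvas[dy + r][dx:dx + sample_w] = sample[r]
            yield dy, dx, canvas


# A 1x2 sample inside a 2x3 internal representation has 2 * 2 = 4 placements.
placements = list(translations([[1, 2]], height=2, width=3))
```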

The "Rotation Types" specify the range of rotation if the rotation symmetry is used. Examples 
are 360°-rotations, -5° to 5° rotations, and -10° to 10° rotations, or other settings that fit the 
user's need. 

The "Reduction Type" specifies the method used when reducing a large image pixel array to a 
smaller pixel array. 

The "Sensitivity" deals with the sample segment size; high sensitivity is for small segment(s) and
low sensitivity is for large segment(s). This is a method to limit the relevant neural connections.
When an ABM net, x1, is trained, there will be certain connections. All possible connections
together form a space, H1. For an ABM net with N neurons, such a space will have a maximum
of 2**N points, where ** represents exponentiation. Each trained ABM net will have a set, h1,
representing non-zero connections. When deciding whether an image, I2, in a search directory is
a match to the current sample image, I1, this image I2 can be turned around to train a new but
similar ABM neural net, x2. This will generate a set of connections, h2. Sensitivity determines a
maximum distance, d, using either the Hausdorff distance, the L1 distance, or the L2 distance. In the
connection space, starting from the connection set, h2, of the new ABM net and applying this
distance, d, a new set, h3, is obtained. Obviously, the smaller this distance, d, is, the smaller
this new set, h3, will be. This new set, h3, is then transformed back to h1. Any point in h1 but
not in h3 will be considered "too far" and therefore is set to 0 for the current image, I2, in the
search directory. This reduction in the connection space is determined by the sensitivity.
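
The reduction of the connection set can be sketched in Python as below. The sketch assumes that each connection is written as a tuple of 0/1 flags (one per neuron), that h1 maps each non-zero connection to its strength, and that the L1 distance between tuples is used; these representations are hypothetical and chosen only to make the idea concrete.

```python
def l1(a, b):
    """L1 distance between two equal-length tuples."""
    return sum(abs(x - y) for x, y in zip(a, b))


def prune_connections(h1, h2, d):
    """Zero every connection of h1 that lies farther than d from all points of h2.

    h1: {connection: strength} for the trained sample net.
    h2: connections generated by the candidate image.
    """
    h2 = list(h2)
    # A connection survives only if it falls inside h3, the d-neighbourhood of h2.
    return {c: (w if any(l1(c, p) <= d for p in h2) else 0) for c, w in h1.items()}


h1 = {(1, 1, 0, 0): 5, (0, 0, 1, 1): 7, (1, 0, 0, 0): 2}
pruned = prune_connections(h1, h2=[(1, 1, 0, 0)], d=1)
```

A smaller d keeps fewer connections, which corresponds to the statement that the set h3 shrinks as the distance shrinks.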

The "Blurring" measures the distortion due to data compression, translation, rotation, scaling,
intensity change, and image format conversion. This method expands an image in the search
directory from a single point to a set as follows. All possible images together form a space, the
image space. An image is a point in such a space. When deciding whether an image, I2, in a
search directory is a match to the current sample image, I1, this image I2 can be expanded into a
small set around I2. Let the set be IS2. Blurring determines a maximum distance, d, using either the
Hausdorff distance, the L1 distance, or the L2 distance. In the image space, starting from I2 and
applying this distance, d, a sphere set, IS2, is obtained. Obviously, the smaller this
distance, d, is, the smaller this set, IS2, will be. Any point in this set, IS2, is treated as just as
good as I2. This expansion in the image space is determined by the Blurring.

The "Shape Cut" is to eliminate many images that have different shapes from the sample
segment. All possible images together form a space, the image space. An image is a point in such
a space. When deciding whether an image, I2, in a search directory is a match to the current
sample image, I1, the distance, d, between I1 and I2 can be determined, using either the
Hausdorff distance, the L1 distance, or the L2 distance. If this distance, d, is larger than a
predetermined distance, D, a mismatch can be declared without going through the ABM neural
net. This predetermined distance, D, is set by the "Shape Cut" parameter.
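
The "Shape Cut" test can be sketched in Python as a cheap pre-filter applied before the neural layer. The sketch assumes both images have already been reduced to pixel arrays of the same dimensions; the function names are hypothetical, and only the L1 and L2 distances are shown (the Hausdorff distance is omitted for brevity).

```python
import math


def l1_distance(a, b):
    """Sum of absolute pixel differences between two same-size pixel arrays."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def l2_distance(a, b):
    """Euclidean distance between two same-size pixel arrays."""
    return math.sqrt(sum((p - q) ** 2 for ra, rb in zip(a, b) for p, q in zip(ra, rb)))


def shape_cut(sample, candidate, max_distance, metric=l1_distance):
    """Return True when the candidate can be declared a mismatch without the neural net."""
    return metric(sample, candidate) > max_distance


sample = [[0, 0], [3, 4]]
candidate = [[0, 0], [0, 0]]
rejected_l1 = shape_cut(sample, candidate, max_distance=6)               # distance 7
rejected_l2 = shape_cut(sample, candidate, max_distance=6, metric=l2_distance)  # distance 5
```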

The "External Weight Cut" is to list only those retrieved images with weights greater than a
certain value. The Weight Cut is an integer greater than or equal to 0. There is no limit on how large
this integer can be.

The "Internal Weight Cut" plays a role similar to the "External Weight Cut", but as a percent value
rather than an absolute weight value.
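
The two weight cuts can be sketched in Python as a filter over the list of (image, weight) pairs. The text does not state what the percent value of the "Internal Weight Cut" is relative to; the sketch assumes it is a percentage of the largest weight in the result list, and the function and parameter names are hypothetical.

```python
def apply_weight_cuts(results, external_cut=0, internal_cut_percent=0):
    """Keep only (name, weight) pairs whose weight exceeds both cuts, sorted by weight."""
    if not results:
        return []
    top = max(weight for _, weight in results)
    # Assumption: the internal cut is a percentage of the largest weight found.
    threshold = max(external_cut, top * internal_cut_percent / 100)
    kept = [(name, weight) for name, weight in results if weight > threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


results = [("c.jpg", 90), ("a.jpg", 1000), ("b.jpg", 400)]
by_external = apply_weight_cuts(results, external_cut=100)
by_internal = apply_weight_cuts(results, internal_cut_percent=50)
```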

The "Image Type" specifies the ABM or APN algorithm. It also instructs the neural layer
component how to compute the weights. The weight can be computed by using the invariant
distribution function of the Markov chain, or by integrating all contributions in the time evolution
of the Markov chain, with or without reaching the invariant distribution.

The "L/S Segment" (Large/Small segment) tells the system where to focus when searching
images. Please refer to the "Sensitivity" parameter to understand the set of contributing
connections, i.e. not every connection is a contributing connection. Small and Large segments
deploy different scales in determining the set of connections.

The "Short/Long" search specifies the image source, such as whether to search one directory or
many directories.

The "Short Cut" is a scrollbar to select an integer between 0 and 99; each integer is mapped to a
set of predefined settings for the parameters.

The "Border Cut" is to eliminate the border sections of images. This parameter controls the 
percentage of images to be eliminated before entering consideration. 

The "Segment Cut" is best illustrated by an example. Assume a 400x400 image is reduced to a 
100x100 internal representation, as set by the parameter "Internal Representation"; then each 
block of 16 original pixels is reduced to 1 pixel. The new value of that single pixel is 
determined by the parameter "Reduction Type". The "Segment Cut" sets a threshold: if the number 
of non-zero pixels among the 16 is greater than the threshold, the new pixel will have a 
non-zero value; otherwise, the new pixel will have a zero value. 
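
The Segment Cut example can be sketched in a few lines. The sketch assumes the image is a list of rows of integer pixel values and that the non-zero output value is 1; in the described system that value would be set by the "Reduction Type" parameter. The function name is hypothetical.

```python
def reduce_with_segment_cut(pixels, factor, segment_cut):
    """Reduce each factor x factor block of pixels to a single pixel.

    A block becomes 1 when it holds more than segment_cut non-zero pixels,
    otherwise 0. (The output value 1 is an assumption for illustration.)
    """
    height, width = len(pixels), len(pixels[0])
    if height % factor or width % factor:
        raise ValueError("image size must be a multiple of the reduction factor")
    reduced = []
    for top in range(0, height, factor):
        row = []
        for left in range(0, width, factor):
            nonzero = sum(
                1
                for y in range(top, top + factor)
                for x in range(left, left + factor)
                if pixels[y][x] != 0
            )
            row.append(1 if nonzero > segment_cut else 0)
        reduced.append(row)
    return reduced
```

For the 400x400 to 100x100 case in the text, `factor` is 4, so each output pixel summarizes 4 x 4 = 16 original pixels.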

Presentation Layer of software for implementation of ABM and APN 
Algorithm 

The presentation layer transforms the image data to neural data. The procedure includes: 

• Open files from the image source; 

• Decode the image into pixels arrays; 

• Process images with a filter; 

• Reduce the size of images to an internal representation. The user can choose the 
internal representation of the images arbitrarily. The reduction factor can be chosen 
for each image case by case, or the same reduction factor can be applied to all 
images. 

• In the case where many pixels in an image have to be combined into a new pixel before 
leaving this layer, the user can choose a reduction type such as taking the average, 
maximum, or minimum, or applying a threshold. 

• Pass the image array to the next layer. 
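
The choice of reduction type in the procedure above can be illustrated by a small dispatch table. This is a sketch only: the dictionary keys, the function name, and the list-of-rows image layout are assumptions, and the threshold option is left out here.

```python
from statistics import mean

# Hypothetical names for the reduction types listed in the text.
REDUCERS = {"average": mean, "maximum": max, "minimum": min}


def reduce_image(pixels, factor, reduction_type="average"):
    """Combine each factor x factor block into one pixel using the chosen reducer."""
    reducer = REDUCERS[reduction_type]
    reduced = []
    for top in range(0, len(pixels), factor):
        row = []
        for left in range(0, len(pixels[0]), factor):
            block = [
                pixels[y][x]
                for y in range(top, top + factor)
                for x in range(left, left + factor)
            ]
            row.append(reducer(block))
        reduced.append(row)
    return reduced
```

The reduced array returned here corresponds to the image array that is passed to the next layer.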

ABM Layer of software for implementation of ABM and APN Algorithm 

The upper level of this layer has two branches: 

• Training Objects 

    • High level training class 

    • Low level training class 

    • Symmetry class 

• Recognition Objects 

    • High level recognition class 

    • Low level recognition class 

The lower level of this layer has only one class, the memory management class. 

The purpose of the memory management class is to claim memory space from RAM, 64K at a time. 
This memory space will be used for storing the connections. It also returns the unnecessary space back to 
the operating system of the computer. 
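
The behavior of the memory management class can be sketched as a pool that grows in fixed-size chunks. The class and method names are hypothetical, "64K" is taken to mean 64 x 1024 bytes, and a Python bytearray stands in for memory claimed from the operating system.

```python
CHUNK_SIZE = 64 * 1024  # "64K at a time", assumed to mean 65536 bytes


class ConnectionMemory:
    """Toy pool that claims storage in 64K chunks and gives unused chunks back."""

    def __init__(self):
        self._chunks = []
        self._used = 0

    @property
    def capacity(self):
        return len(self._chunks) * CHUNK_SIZE

    def reserve(self, nbytes):
        """Claim chunks, one at a time, until nbytes more bytes fit."""
        while self._used + nbytes > self.capacity:
            self._chunks.append(bytearray(CHUNK_SIZE))
        self._used += nbytes

    def free(self, nbytes):
        """Mark nbytes as no longer needed."""
        self._used = max(0, self._used - nbytes)

    def release_unused(self):
        """Drop chunks beyond what is still in use; return how many were dropped."""
        needed = -(-self._used // CHUNK_SIZE)  # ceiling division
        released = len(self._chunks) - needed
        del self._chunks[needed:]
        return released
```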

The low level training object provides all necessary functions used by the high level training 
class. 

The symmetry object implements the symmetry defined earlier. 

The high level training class incorporates symmetry and implements the ABM or APN 
algorithm. The "Image Type" parameter in the user interface determines which algorithm is 
used. 

ABM Training Algorithm is: 

1. Delete the existing ABM connections. 

2. Combine an image and its classification into an input vector. 

3. The ABM neural connections are calculated based on the input vector. Let N be the number 
of neurons; these connections can be up to the order of N. The image is randomly broken 
down into a predefined number of pieces. 

4. Let an image piece, pi, have K = (k1 + k2) pixels, where K is an integer. After imposing the 
pixel vector on the ABM net, k1 is the number of neurons excited and k2 is the number of 
neurons grounded. A neural state vector can be constructed to represent such a configuration, 
with k1 components being 1 and k2 components being 0. 

5. All such vectors together form a space, the connection space. A distance, either the 
Hausdorff distance or L1 distance or L2 distance, can be defined in this space. Such a 
definition of a distance allows all possible connection vectors to be classified via a distance 
from pi. Many vectors will be in a group with distance 1 from pi. Many vectors will be in a 
group with distance 2 from pi, ... 

6. The connection represented by pi is assigned the largest synaptic connection weight. Those 
connections in the distance 1 group will have smaller weights, .... After a certain distance, 
the connection weights will be 0, or there will be no connections. The present invention 
covers all possible combinations of such a generating method. 

7. The Markov chain is formed after the connections are established. 
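
Steps 4 to 6 above can be illustrated with a small sketch. It assumes the L1 distance on 0/1 state vectors (which equals the number of differing components), a weight that falls by 1 for each unit of distance, and a cutoff distance beyond which no connection is created. The decay rule, the parameter values, and the function name are assumptions chosen for illustration; the text covers any such generating method.

```python
from itertools import combinations


def connection_weights(state, max_weight=4, cutoff=2):
    """Map binary vectors near `state` to weights that fall with L1 distance.

    The vector equal to `state` gets max_weight; vectors at distance d get
    max_weight - d (an assumed linear decay); vectors farther than `cutoff`
    get no entry, that is, no connection.
    """
    if max_weight <= cutoff:
        raise ValueError("max_weight must exceed cutoff so weights stay positive")
    weights = {}
    for d in range(cutoff + 1):
        # Every vector at L1 distance d differs from `state` in exactly d places.
        for flipped in combinations(range(len(state)), d):
            vec = list(state)
            for i in flipped:
                vec[i] = 1 - vec[i]
            weights[tuple(vec)] = max_weight - d
    return weights
```

For a piece with state (1, 1, 0, 0), that is k1 = 2 and k2 = 2, the sketch creates 1 + 4 + 6 = 11 connections: the piece itself, the four vectors at distance 1, and the six vectors at distance 2.
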

APN Training Algorithm is: 

1. Delete the existing ABM connections. 

2. Combine an image and its classification into an input vector. 

3. The ABM neural connections are calculated based on the input vector. Let N be the 
number of neurons; these connections can be up to the order of N. The image is randomly 
broken down into a predefined number of pieces. 

4. Let an image piece, pi, have K = (k1 + k2) pixels, where K is an integer. After imposing 
the pixel vector on the ABM net, k1 is the number of neurons excited and k2 is the 
number of neurons grounded. A neural state vector can be constructed to represent such a 
configuration, with k1 components being 1 and k2 components being 0. 

5. All such vectors together form a space, the connection space. A distance, either the 
Hausdorff distance or L1 distance or L2 distance, can be defined in this space. Such a 
definition of a distance allows all possible connection vectors to be classified via a 
distance from pi. Many vectors will be in a group with distance 1 from pi. Many vectors 
will be in a group with distance 2 from pi, ... 

6. The connection represented by pi is assigned the largest synaptic connection weight. 
Those connections in the distance 1 group will have smaller weights, .... After a certain 
distance, the connection weights will be 0, or there will be no connections. The present 
invention covers all possible combinations of such a generating method. 

7. The Markov chain is formed after the connections are established. 

8. For each connection, in addition to the synaptic connection weight, a mapping over each 
connection is established. Let k1 be the number of neurons in the original k1-order 
connection generated by pi; this mapping then maps each of the k1 neurons to the pixel 
value which excited it. This completes the connection for the original 
segment pi. 

9. The segment, pi, also generates many other connections. If a neuron in such a connection is 
one of the original k1 neurons in pi, then this neuron is mapped to the corresponding 
pixel value, which caused this neuron to be excited; otherwise, this neuron is mapped 
to 0. This completes the mappings of all connections generated by this segment pi. 
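
The mapping of steps 8 and 9 can be sketched as a dictionary lookup. The representation is an assumption: the original segment is given as a dictionary from neuron index to the pixel value that excited that neuron, and a connection is given as a list of neuron indices. The function name is hypothetical.

```python
def apn_mapping(segment_pixels, connection):
    """Attach a pixel value to every neuron of a connection.

    segment_pixels: neuron index -> pixel value for the k1 excited neurons
                    of the original segment.
    connection:     neuron indices of one connection generated by the segment.
    Neurons that are not among the original k1 neurons are mapped to 0.
    """
    return {neuron: segment_pixels.get(neuron, 0) for neuron in connection}
```

Applied to the original connection, the mapping returns the pixel values unchanged (step 8); applied to a derived connection, neurons outside the original segment receive 0 (step 9).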

The low-level recognition object provides all necessary functions used by the high-level 
recognition class. 

The high-level recognition class implements the ABM or APN algorithm. The "Image Type" 
parameter in the user interface determines which algorithm is used. 

ABM Recognition Algorithm is: 

1. An image to be classified is imposed on the Markov chain. 

2. This Markov chain will settle on its invariant distribution. A distribution function is 
deployed to describe such a distribution. 

3. This distribution function, once obtained, can be used to classify images. This will 
produce triplets of image, class, and weight. Image retrieval and classification are two 
sides of the same coin. 

4. These triplets of image, classification, and weight can be viewed as the results of the 
classification process. For the search process, a doublet of image and weight is 
displayed. The second part of the triplet is omitted because the search problem has only 
one class. 
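
The notion of a Markov chain settling on its invariant distribution (step 2) can be illustrated with a generic power iteration. This is textbook material rather than the ABM implementation: how the ABM connections define the transition probabilities is not shown here, and the transition matrix in the usage example is made up.

```python
def invariant_distribution(transition, tol=1e-12, max_steps=10_000):
    """Iterate dist <- dist * P until the distribution stops changing.

    transition[i][j] is the probability of moving from state i to state j;
    every row must sum to 1.
    """
    n = len(transition)
    dist = [1.0 / n] * n
    for _ in range(max_steps):
        nxt = [sum(dist[i] * transition[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, dist)) < tol:
            return nxt
        dist = nxt
    raise RuntimeError("the chain did not settle within max_steps")
```

For the two-state chain `[[0.9, 0.1], [0.5, 0.5]]`, the iteration settles on approximately (5/6, 1/6). In the recognition setting, the values of such a distribution would play the role of the weights attached to each class.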

APN Recognition Algorithm is: 

1. An image to be classified is imposed on the Markov chain. 

2. This chain will settle on its invariant distribution. A distribution function is 
deployed to describe such a distribution. 

3. This distribution function, once obtained, can be used to classify images. This will produce 
triplets of image, class, and weight. 

4. Comparing the input vector with the APN connection vector modifies this weight. All 
connection vectors together form a vector space. A distance, either the L1 distance or the L2 
distance, can be defined in this space. The basic idea is that the new weight will be directly 
proportional to the old weight and inversely proportional to this distance. The present 
invention covers all functions for obtaining the new weight: 

New weight = f (old weight, distance). 

This will produce a new set of triplets of image, classification, and weight. 

5. These triplets of image, classification, and weight can be viewed as the results of the 
classification process. For the search process, a doublet of image and weight is 
displayed. The second part of the triplet is omitted because the search problem has only 
one class. 
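
One possible choice of the function f in step 4 is sketched below. The text leaves f open; the form old weight / (1 + distance) is an assumption chosen so that an exact match (distance 0) keeps the old weight and division by zero cannot occur. The L1 distance and the function name are likewise illustrative choices.

```python
def apn_weight(old_weight, input_vector, connection_vector):
    """New weight = f(old weight, distance), with f = old / (1 + L1 distance).

    This f is only one example of a function that grows with the old weight
    and shrinks as the distance grows.
    """
    if len(input_vector) != len(connection_vector):
        raise ValueError("vectors must have the same length")
    distance = sum(abs(x - y) for x, y in zip(input_vector, connection_vector))
    return old_weight / (1.0 + distance)
```

With an old weight of 120, an input vector identical to the connection vector keeps the weight at 120, while a vector at L1 distance 3 reduces it to 30.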

IVI-API (Image Verification and Identification Application Programming 
Interface) 

A typical image matching application structure is: 

• GUI (graphical user interface) Layer 

• DBMS (database management system) Layer 

• IVI-API (image verification and identification API) Layer 

• SPI (Service Provider Interface) Layer 

• OS (Operating System) and Hardware Layer 

The IVI-API is transparent for SPI (Service Provider Interface): the SPI functions will pass right through 
the IVI-API. The SPI can be accessed directly from layers above the IVI-API layer, i.e. the DBMS layer 
or GUI layer. 

There are two main functions in the API layer, verify and identify, and there is one main function in the 
SPI layer, capture. 

The two top-level jobs for verification are Enrollment and Verify. The two top-level jobs for 
identification are Enrollment and Identify. Enrollment, in either case, is nothing but setting a few 
parameters; the IVI-API deals with the raw images directly. In this API, there is only one top-level 
function for verification, Verify, and only one top-level function for identification, Identify. 

This IVI-API does not have an enrollment process. The enrollment is replaced by setting two parameters: 

• The image in question; 

• The folder of previously stored images. 

This IVI-API does require an image storage structure that applications should follow, so that the 
folder of previously stored images can be passed to the verification and identification functions. 
Both the verification path and the identification path are parameters, which can be changed by the 
parameter writer functions. The image in question can be stored anywhere on a hard drive. The 
previously stored images must follow this structure: 

Verification 

The previously stored images must be stored at: 
verification path\ID\ 

Example. Assume: 

1. The verification path (a parameter) is: 
c:\Attrasoft\verification\ 

2. A set of doublets is: 

Image           imageID 

Gina1.jpg       12001 
Gina2.jpg       12001 
Tiffany1.jpg    12002 
Tiffany2.jpg    12002 

Then the storage structure is: 

c:\Attrasoft\verification\12001\gina1.jpg 
c:\Attrasoft\verification\12001\gina2.jpg 
c:\Attrasoft\verification\12002\tiffany1.jpg 
c:\Attrasoft\verification\12002\tiffany2.jpg 

Identification 

The folder of previously stored images must be stored at: 
identification path\ 

Example. Assume: 

1. The identification path (a parameter) is: 
c:\Attrasoft\identification\ 

2. A set of doublets is: 

Image           imageID 

Gina1.jpg       12001 
Gina2.jpg       12001 
Tiffany1.jpg    12002 
Tiffany2.jpg    12002 

If the number of images is less than 1000, then the storage structure is 



c:\Attrasoft\identification\gina1.jpg 
c:\Attrasoft\identification\gina2.jpg 
c:\Attrasoft\identification\tiffany1.jpg 
c:\Attrasoft\identification\tiffany2.jpg 



If the number of images is more than 1000, then the sub-directories should be used: 



c:\Attrasoft\identification\dir0000\gina1.jpg 
c:\Attrasoft\identification\dir0000\gina2.jpg 
c:\Attrasoft\identification\dir0000\tiffany1.jpg 
c:\Attrasoft\identification\dir0000\tiffa^ 
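
The storage structures above can be reproduced with a short path-building sketch. The function names are hypothetical. The text only states that sub-directories should be used for more than 1000 images and shows the name dir0000; the rule used here (1000 images per sub-directory, numbered dir0000, dir0001, and so on, and sub-directories also used at exactly 1000 images) is an assumption.

```python
from pathlib import PureWindowsPath


def verification_path(root, image_id, filename):
    """Build root\\imageID\\filename for a previously stored verification image."""
    return PureWindowsPath(root) / str(image_id) / filename


def identification_path(root, index, filename, total_images, per_dir=1000):
    """Build the storage path of the index-th identification image.

    Fewer than per_dir images: files sit directly in the identification path.
    Otherwise (ASSUMPTION): per_dir images go into each sub-directory dirNNNN.
    """
    base = PureWindowsPath(root)
    if total_images < per_dir:
        return base / filename
    return base / f"dir{index // per_dir:04d}" / filename
```

For example, `verification_path(r"c:\Attrasoft\verification", 12001, "gina1.jpg")` yields `c:\Attrasoft\verification\12001\gina1.jpg`, matching the verification structure shown earlier.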

Enrollment 

The enrollment process builds the folder of previously stored images according to the above structure. 
The folder of previously stored images will be a parameter for the AVI layer, called the verification 
directory, identification directory, or search directory. There will be a section to address the 
parameters later. Because enrollment means passing parameters, the enrollment is always 100%. 

1:N Matching 



The following methods (one main function and three result readers) are used to perform the Verification 
function: 

int verify(String image, long imageID); 
long getVerifyID(); 
String getVerifyName(); 
long getVerifyWeight(); 



A typical process is: 

• Initialize System 

• Capture image 

• Calculate the template 